A Comparative Study of Classification Algorithms using Data Mining: Crime and Accidents in Denver City the USA

نویسندگان

  • Amit Gupta
  • Azeem Mohammad
  • Ali Syed
  • Malka N. Halgamuge
چکیده

In the last five years, crime and accidents rates have increased in many cities of America. The advancement of new technologies can also lead to criminal misuse. In order to reduce incidents, there is a need to understand and examine emerging patterns of criminal activities. This paper analyzed crime and accident datasets from Denver City, USA during 2011 to 2015 consisting of 372,392 instances of crime. The dataset is analyzed by using a number of Classification Algorithms. The aim of this study is to highlight trends of incidents that will in return help security agencies and police department to discover precautionary measures from prediction rates. The classification of algorithms used in this study is to assess trends and patterns that are assessed by BayesNet, NaiveBayes, J48, JRip, OneR and Decision Table. The output that has been used in this study, are correct classification, incorrect classification, True Positive Rate (TP), False Positive Rate (FP), Precision (P), Recall (R) and Fmeasure (F). These outputs are captured by using two different test methods: k-fold cross-validation and percentage split. Outputs are then compared to understand the classifier performances. Our analysis illustrates that JRip has classified the highest number of correct classifications by 73.71% followed by decision table with 73.66% of correct predictions, whereas OneR produced the least number of correct predictions with 64.95%. NaiveBayes took the least time of 0.57 sec to build the model and perform classification when compared to all the classifiers. The classifier stands out producing better results among all the classification methods. This study would be helpful for security agencies and police department to discover data patterns and analyze trending criminal activity from prediction rates. Keywords—Data Mining; Classification; Big Data; Crime and Accident

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions,as well as the popularity of the World Wide Weband e-commerce, a significant increase in the volume offinancial transactions observed. In addition to the increasein turnover, a huge increase in the number of fraud by user’sabnormality is resulting in billions of dollars in lossesover the world. T...

متن کامل

Analysis of Pre-processing and Post-processing Methods and Using Data Mining to Diagnose Heart Diseases

Today, a great deal of data is generated in the medical field. Acquiring useful knowledge from this raw data requires data processing and detection of meaningful patterns and this objective can be achieved through data mining. Using data mining to diagnose and prognose heart diseases has become one of the areas of interest for researchers in recent years. In this study, the literature on the ap...

متن کامل

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions,as well as the popularity of the World Wide Weband e-commerce, a significant increase in the volume offinancial transactions observed. In addition to the increasein turnover, a huge increase in the number of fraud by user’sabnormality is resulting in billions of dollars in lossesover the world. T...

متن کامل

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council

Supervised clustering is a data mining technique that assigns a set of data to predefined classes by analyzing dataset attributes. It is considered as an important technique for information retrieval, management, and mining in information systems. Since customer satisfaction is the main goal of organizations in modern society, to meet the requirements, 137 call center of Tehran city council is ...

متن کامل

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council

Supervised clustering is a data mining technique that assigns a set of data to predefined classes by analyzing dataset attributes. It is considered as an important technique for information retrieval, management, and mining in information systems. Since customer satisfaction is the main goal of organizations in modern society, to meet the requirements, 137 call center of Tehran city council is ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016